{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Setting up Python for machine learning: scikit-learn and Jupyter Notebook ([video #2](https://www.youtube.com/watch?v=IsXXlYVBt1M&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=2))\n",
    "\n",
    "Created by [Data School](http://www.dataschool.io/). Watch all 9 videos on [YouTube](https://www.youtube.com/playlist?list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A). Download the notebooks from [GitHub](https://github.com/justmarkham/scikit-learn-videos).\n",
    "\n",
    "**Note:** Since the video recording, the official name of the \"IPython Notebook\" was changed to \"Jupyter Notebook\". However, the functionality is the same."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Agenda\n",
    "\n",
    "- What are the benefits and drawbacks of scikit-learn?\n",
    "- How do I install scikit-learn?\n",
    "- How do I use the Jupyter Notebook?\n",
    "- What are some good resources for learning Python?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![scikit-learn algorithm map](images/02_sklearn_algorithms.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Benefits and drawbacks of scikit-learn\n",
    "\n",
    "### Benefits:\n",
    "\n",
    "- **Consistent interface** to machine learning models\n",
    "- Provides many **tuning parameters** but with **sensible defaults**\n",
    "- Exceptional **documentation**\n",
    "- Rich set of functionality for **companion tasks**\n",
    "- **Active community** for development and support\n",
    "\n",
    "### Potential drawbacks:\n",
    "\n",
    "- Harder (than R) to **get started with machine learning**\n",
    "- Less emphasis (than R) on **model interpretability**\n",
    "\n",
    "### Further reading:\n",
    "\n",
    "- Ben Lorica: [Six reasons why I recommend scikit-learn](http://radar.oreilly.com/2013/12/six-reasons-why-i-recommend-scikit-learn.html)\n",
    "- scikit-learn authors: [API design for machine learning software](http://arxiv.org/pdf/1309.0238v1.pdf)\n",
    "- Data School: [Should you teach Python or R for data science?](http://www.dataschool.io/python-or-r-for-data-science/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![scikit-learn logo](images/02_sklearn_logo.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installing scikit-learn\n",
    "\n",
    "**Option 1:** [Install scikit-learn library](http://scikit-learn.org/stable/install.html) and dependencies (NumPy and SciPy)\n",
    "\n",
    "**Option 2:** [Install Anaconda distribution](https://www.anaconda.com/download/) of Python, which includes:\n",
    "\n",
    "- Hundreds of useful packages (including scikit-learn)\n",
    "- IPython and Jupyter Notebook\n",
    "- conda package manager\n",
    "- Spyder IDE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Jupyter logo](images/02_jupyter_logo.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using the Jupyter Notebook\n",
    "\n",
    "### Components:\n",
    "\n",
    "- **IPython interpreter:** enhanced version of the standard Python interpreter\n",
    "- **Browser-based notebook interface:** weave together code, formatted text, and plots\n",
    "\n",
    "### Installation:\n",
    "\n",
    "- **Option 1:** [Install the Jupyter notebook](https://jupyter.readthedocs.io/en/latest/install.html) (includes IPython)\n",
    "- **Option 2:** Included with the Anaconda distribution\n",
    "\n",
    "### Launching the Notebook:\n",
    "\n",
    "- Type **jupyter notebook** at the command line to open the dashboard\n",
    "- Don't close the command line window while the Notebook is running\n",
    "\n",
    "### Keyboard shortcuts:\n",
    "\n",
    "**Command mode** (gray border)\n",
    "\n",
    "- Create new cells above (**a**) or below (**b**) the current cell\n",
    "- Navigate using the **up arrow** and **down arrow**\n",
    "- Convert the cell type to Markdown (**m**) or code (**y**)\n",
    "- See keyboard shortcuts using **h**\n",
    "- Switch to Edit mode using **Enter**\n",
    "\n",
    "**Edit mode** (green border)\n",
    "\n",
    "- **Ctrl+Enter** to run a cell\n",
    "- Switch to Command mode using **Esc**\n",
    "\n",
    "### IPython, Jupyter, and Markdown resources:\n",
    "\n",
    "- [nbviewer](http://nbviewer.jupyter.org/): view notebooks online as static documents\n",
    "- [IPython documentation](http://ipython.readthedocs.io/en/stable/)\n",
    "- [Jupyter Notebook quickstart](http://jupyter.readthedocs.io/en/latest/content-quickstart.html)\n",
    "- [GitHub's Mastering Markdown](https://guides.github.com/features/mastering-markdown/): short guide with lots of examples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Resources for learning Python\n",
    "\n",
    "- [Codecademy's Python course](https://www.codecademy.com/learn/learn-python): browser-based, tons of exercises\n",
    "- [DataQuest](https://www.dataquest.io/): browser-based, teaches Python in the context of data science\n",
    "- [Google's Python class](https://developers.google.com/edu/python/): slightly more advanced, includes videos and downloadable exercises (with solutions)\n",
    "- [Python for Everybody](https://www.py4e.com/): beginner-oriented book, includes slides and videos"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comments or Questions?\n",
    "\n",
    "- Email: <kevin@dataschool.io>\n",
    "- Website: http://dataschool.io\n",
    "- Twitter: [@justmarkham](https://twitter.com/justmarkham)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>\n",
       "    @font-face {\n",
       "        font-family: \"Computer Modern\";\n",
       "        src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf');\n",
       "    }\n",
       "    div.cell{\n",
       "        width: 90%;\n",
       "/*        margin-left:auto;*/\n",
       "/*        margin-right:auto;*/\n",
       "    }\n",
       "    ul {\n",
       "        line-height: 145%;\n",
       "        font-size: 90%;\n",
       "    }\n",
       "    li {\n",
       "        margin-bottom: 1em;\n",
       "    }\n",
       "    h1 {\n",
       "        font-family: Helvetica, serif;\n",
       "    }\n",
       "    h4{\n",
       "        margin-top: 12px;\n",
       "        margin-bottom: 3px;\n",
       "       }\n",
       "    div.text_cell_render{\n",
       "        font-family: Computer Modern, \"Helvetica Neue\", Arial, Helvetica, Geneva, sans-serif;\n",
       "        line-height: 145%;\n",
       "        font-size: 130%;\n",
       "        width: 90%;\n",
       "        margin-left:auto;\n",
       "        margin-right:auto;\n",
       "    }\n",
       "    .CodeMirror{\n",
       "            font-family: \"Source Code Pro\", source-code-pro,Consolas, monospace;\n",
       "    }\n",
       "/*    .prompt{\n",
       "        display: None;\n",
       "    }*/\n",
       "    .text_cell_render h5 {\n",
       "        font-weight: 300;\n",
       "        font-size: 16pt;\n",
       "        color: #4057A1;\n",
       "        font-style: italic;\n",
       "        margin-bottom: 0.5em;\n",
       "        margin-top: 0.5em;\n",
       "        display: block;\n",
       "    }\n",
       "\n",
       "    .warning{\n",
       "        color: rgb( 240, 20, 20 )\n",
       "        }\n",
       "</style>\n",
       "<script>\n",
       "    MathJax.Hub.Config({\n",
       "                        TeX: {\n",
       "                           extensions: [\"AMSmath.js\"]\n",
       "                           },\n",
       "                tex2jax: {\n",
       "                    inlineMath: [ ['$','$'], [\"\\\\(\",\"\\\\)\"] ],\n",
       "                    displayMath: [ ['$$','$$'], [\"\\\\[\",\"\\\\]\"] ]\n",
       "                },\n",
       "                displayAlign: 'center', // Change this to 'center' to center equations.\n",
       "                \"HTML-CSS\": {\n",
       "                    styles: {'.MathJax_Display': {\"margin\": 4}}\n",
       "                }\n",
       "        });\n",
       "</script>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from IPython.core.display import HTML\n",
    "def css_styling():\n",
    "    styles = open(\"styles/custom.css\", \"r\").read()\n",
    "    return HTML(styles)\n",
    "css_styling()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
